11 research outputs found

Adquisición y representación del conocimiento mediante procesamiento del lenguaje natural

Author: Fernández Gavilanes Milagros
Publication date: 01/01/2012

[Abstract] This thesis introduces a framework for information retrieval that combines natural language processing with domain knowledge, covering the whole process of creating, managing and querying a document collection. The approach automatically integrates linguistic knowledge into a formal model of semantic representation that the system can handle directly. This allows the construction of algorithms that simplify maintenance tasks, give non-specialist users more flexible access, and eliminate subjective components that lead to hard-to-predict behavior. Linguistic knowledge acquisition starts from a dependency parse based on a mildly context-sensitive grammatical formalism, which combines computational efficiency with expressive power. The formal interpretation of the semantics rests on the notion of conceptual graph, which serves as the basis for representing both the collection and the queries posed against it. In this context, the proposal addresses the automatic generation of these representations from the linguistic knowledge acquired from the texts; they constitute the starting point for indexing. Graph operations, together with the principles of projection and generalization, are then used to compute and rank the answers, so that the inherent imprecision and incompleteness of retrieval are taken into account. In addition, the visual nature of graphs allows the construction of user-friendly interfaces, reconciling precision and intuition in their management. At this point, the proposal also includes a framework for formal testing.

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

A library for automatic natural language generation of Spanish texts

Author: Costa Montenegro Enrique
Fernández Gavilanes Milagros
García Méndez Silvia
González Castaño Francisco Javier
Juncal Martínez Jonathan
Publication venue: Expert Systems with Applications
Publication date: 15/04/2019

In this article we present a novel system for natural language generation (NLG) of Spanish sentences from a minimum set of meaningful words (such as nouns, verbs and adjectives) which, unlike other state-of-the-art solutions, performs the NLG task in a fully automatic way, exploiting both knowledge-based and statistical approaches. Relying on its linguistic knowledge of vocabulary and grammar, the system is able to generate complete, coherent and correctly spelled sentences from the main word sets presented by the user. The system, which was designed to be integrable, portable and efficient, can be easily adapted to other languages by design and can feasibly be integrated in a wide range of digital devices. During its development we also created a supplementary lexicon for Spanish, aLexiS, with wide coverage and high precision, as well as syntactic trees from a freely available definite-clause grammar. The resulting NLG library has been evaluated both automatically and manually (annotation). The system can potentially be used in different application domains such as augmentative communication and automatic generation of administrative reports or news.

Xunta de Galicia | Ref. ED341D R2016/012
Xunta de Galicia | Ref. GRC 2014/046
Ministerio de Economía, Industria y Competitividad | Ref. TEC2016-76465-C2-2-

Investigo

Demographic market segmentation on short banking movement descriptions applying Natural Language Processing

Author: Barba Seara Óscar
De Arriba Perez Francisco
Fernández Gavilanes Milagros
García Méndez Silvia
González Castaño Francisco Javier
Publication date: 01/01/2021

Xunta de Galicia | Ref. GRC2018/05

Investigo

A System for Automatic English Text Expansion

Author: Costa-Montenegro Enrique
Fernández-Gavilanes Milagros
García-Méndez Silvia
González-Castaño Francisco J.
Juncal-Martínez Jonathan
Reiter Ehud
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2019

This work was supported in part by the Mineco, Spain, under Grant TEC2016-76465-C2-2-R, in part by the Xunta de Galicia, Spain, under Grant GRC-2018/53 and Grant ED341D R2016/012, and in part by the University of Vigo Travel Grant to visit the CLAN Research Group, University of Aberdeen, U.K.

Peer reviewed
Publisher PD

Aberdeen University Research

A system for automatic English text expansion

Author: Costa Montenegro Enrique
Fernández Gavilanes Milagros
García Méndez Silvia
González Castaño Francisco Javier
Juncal Martínez Jonathan
Reiter Ehud
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/12/2022

We present an automatic text expansion system to generate English sentences, which performs automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches. Here, "automatic" means that the system can generate coherent and correct sentences from a minimum set of words. From its inception, the design is modular and adaptable to other languages. This adaptability is one of its greatest advantages. For English, we have created the highly precise aLexiE lexicon with wide coverage, which represents a contribution on its own. We have evaluated the resulting NLG library in an Augmentative and Alternative Communication (AAC) proof of concept, both directly (by regenerating corpus sentences) and manually (from annotations) using a popular corpus in the NLG field. We performed a second analysis by comparing the quality of text expansion in English with that in Spanish, using an ad hoc Spanish-English parallel corpus. The system might also be applied to other domains such as report and news generation.

Ministerio de Economía, Industria y Competitividad | Ref. TEC2016-76465-C2-2-R
Xunta de Galicia | Ref. GRC-2018/53
Xunta de Galicia | Ref. ED341D R2016/012
University of Aberdee

Investigo

Métodos y técnicas de monitoreo y predicción temprana en los escenarios de riesgos socionaturales

Author: Alonso Rivera Paulino
Arriaga Rivera Armando
Arévalo Mejía Ricardo
Balbuena Medina Alondra
Balderas Plata Miguel Ángel
Baró Suárez José Emilio
Becerril Piña Rocío
Bâ Khalidou M.
Cambranis Marín Rafael Humberto
Campos Vargas María Milagros
Canchola Pantoja Yered Gybram
Cardona Parra César Augusto
Carreto Bernal Fernando
Castelán Pescina Gilberto
Domech González Armando Antonio
Dávila Hernández Norma Angélica
Díaz Delgado Carlos
Díaz Sánchez Srahyrlandy Rocío
Escalona Hernández José Luis
Espinosa Rodríguez Luis Miguel
Estrada Velázquez Cristina
Febles Díaz José Miguel
Febles González José Manuel
Fernández Córdoba Jhonattan
Garatachía Ramírez Juan Carlos
García Millán Nathalie
García Reyna Miguel Eduardo
Gavilanes Ruiz Juan Carlos
Guevara Ortiz Enrique
Gutiérrez Cedillo Jesús Gastón
Gómez Albores Miguel Ángel
Hernández César Edgar Daniel
Hernández Santana José Ramón
Jaimes Viera María del Carmen
López del Campo Rubén
Martínez Tapia Miguel
Mastachi-Loza Carlos Alberto
Mier y Terán Suárez Jorge
Monroy Gaytán Francisco
Méndez Linares Ana Patricia
Nieto Torres Amiel
Olivera Guadarrama Ruggiero
Ordaz Hernández Alexis
Pérez Pérez Anaid
Reyes López Alonso
Rivera Cano Luz Elena
Rodríguez Soto Clarita
Rodríguez Van Gort Frances
Ruiz Velázquez Mario Álvaro
Salcedo Hurtada Elkin de Jesús
Salinas Tapia Humberto
Serrano Barquín Rebeca Angélica
Soto Romero Martín Pánfilo
Torres Maya Adrián
Varley Nick
Vilchis Francés Aleida Yadira
Vélez Correa Jorge Andrés
Zepeda Mondragón Francisco
Publication venue: Universidad Autónoma del Estado de México, AM Editores
Publication date: 01/09/2021

This work brings together the fundamental methods and techniques for tracking and monitoring the dynamics of socio-natural risk scenarios (geological and hydrometeorological). Its general objective is to guide, support and accompany civil protection managers and field staff in putting into practice the actions and public policies focused on local disaster risk management.

Repositorio Institucional de la Universidad Autónoma del Estado de México

De la adquisición del conocimiento a la recuperación de información

Author: Carrera Carrera Sara
Fernández Gavilanes Milagros
Vilares Ferro Manuel
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2008

We introduce a proposal on information retrieval based on the consideration of complex syntactic and semantic resources which are automatically generated from the document collection itself. The paper describes a strategy where the language and the domain of the documents are independent of the process.

Work partially supported by the Spanish Government from research projects TIN2004-07246-C03-01 and HUM2007-66607-C04-02, and by the Autonomous Government of Galicia from projects PGIDIT05PXIC30501PN, 07SIN005206PR and the Galician Network for NLP and IR

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Identifying banking transaction descriptions via support vector machine short-text classification based on a specialized labelled corpus

Author: Barba Seara Óscar
Fernández Gavilanes Milagros
García Méndez Silvia
González Castaño Francisco Javier
Juncal Martínez Jonathan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2022

Short texts are omnipresent in real-time news, social network commentaries, etc. Traditional text representation methods have been successfully applied to self-contained documents of medium size. However, information in short texts is often insufficient, due, for example, to the use of mnemonics, which makes them hard to classify. Therefore, the particularities of specific domains must be exploited. In this article we describe a novel system that combines Natural Language Processing techniques with Machine Learning algorithms to classify banking transaction descriptions for personal finance management, a problem that was not previously considered in the literature. We trained and tested that system on a labelled dataset with real customer transactions that will be available to other researchers on request. Motivated by existing solutions in spam detection, we also propose a short text similarity detector to reduce training set size based on the Jaccard distance. Experimental results with a two-stage classifier combining this detector with an SVM indicate a high accuracy in comparison with alternative approaches, taking into account complexity and computing time. Finally, we present a use case with a personal finance application, CoinScrap, which is available at "Google Play" and "App Store".

Ministerio de Economía, Industria y Competitividad | Ref. TEC2016-76465-C2-2-R
Xunta de Galicia | Ref. GRC2018/053
Xunta de Galicia | Ref. ED341D-R2016/01
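
As a rough illustration of the Jaccard-based similarity detector mentioned in this abstract, the following Python sketch drops near-duplicate short texts before training. The tokenization (lowercased whitespace split), the 0.5 threshold, the function names and the sample strings are all assumptions made for illustration; the article's actual detector and its parameters are not described in this record.

```python
def jaccard_distance(a: str, b: str) -> float:
    """Jaccard distance between the token sets of two short texts."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0  # two empty texts are treated as identical
    return 1.0 - len(set_a & set_b) / len(set_a | set_b)


def drop_near_duplicates(texts: list[str], threshold: float = 0.5) -> list[str]:
    """Keep a text only if it is farther than `threshold` from every kept text."""
    kept: list[str] = []
    for text in texts:
        if all(jaccard_distance(text, k) > threshold for k in kept):
            kept.append(text)
    return kept


# Hypothetical transaction descriptions, invented for this sketch.
samples = [
    "card payment supermarket 1234",
    "card payment supermarket 5678",
    "transfer received salary",
]
reduced = drop_near_duplicates(samples, threshold=0.5)
```

With these inputs the first two descriptions share three of five distinct tokens, so the second one falls within the threshold and is discarded, while the unrelated third description is kept. A smaller reduced set of this kind could then be passed to a classifier such as an SVM.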

Investigo

Differentiating users by language and location estimation in sentiment analysis of informal text during major public events

Crossref